Method and apparatus for audio processing

ABSTRACT

A method and apparatus for introducing a time-varying time delay or phase shift randomly into the individual reproduction channels of a sound recording, two in the case of binaural presentation. This emulates the temporal aspect of microphone and/or listener motion. The present invention may be applied as a unidirectional process. No preparation of the source material is required. It can be applied to any multichannel audio signal set. It can process analog or digital signals. The process may be used with headphones, loudspeakers, hearing aids or similar assistive hearing devices.

This application is a continuation of U.S. patent application Ser. No. 14/589,341, filed Jan. 5, 2015, which is a continuation of U.S. patent application Ser. No. 14/109,223, filed Dec. 17, 2013, issued as U.S. Pat. No. 8,929,560, which is a continuation of U.S. patent application Ser. No. 12/193,036, filed Aug. 17, 2008, issued as U.S. Pat. No. 8,611,557, which claims priority to Provisional Patent Application No. 60/956,584, filed Aug. 17, 2007, entitled “Method and Process for Audio Processing,” and is entitled to those filing dates, in whole or in part, for priority. The complete disclosures, specifications, drawings and attachments of Provisional Patent Application No. 60/956,584 and U.S. patent application Ser. Nos. 12/193,036 and 14/109,223 and 14/589,341 are incorporated herein in their entireties for all purposes by specific reference.

FIELD OF INVENTION

This invention relates to a method and process of processing audio signals for the purpose of improved recognition of timbre. More particularly, this invention relates to a method and process for temporally modifying audio signals by simulation of missing reverberant cues.

BACKGROUND OF INVENTION

Timbre is generally defined as the tonal identity of a sound. It is the attribute that distinguishes a sound from other sounds of the same pitch and intensity. While the term is most commonly used in a musical connotation, timbre is important in other ways because it is a fundamental aspect of the importance of a sound in the hierarchy of threat or alarm.

In the presentation of music, it can be far more important to quickly identify what the sound is than where it is. This distinction is both intellectual and intuitive; intellectually, timbre is critical to being able to unravel the musical texture in order to understand it. Intuitively, timbre is a fundamental input to the limbic nervous system which is the seat of emotional response. If timbre cannot be quickly perceived, then the musical texture cannot be decoded, nor can an emotional response be elicited. Conscious effort to “understand” the sound impedes the possibility of viscerally reacting to it. The ability to viscerally react to music is an important element of therapeutic effectiveness in music therapy. Basically, improvement in timbre perception allows the conscious thought process to be bypassed.

When a recording is made with the microphones or the performers (or both) in motion, upon playback musical timbre can be more quickly identified. It is hypothesized that this is due to an interaction with human hearing which allows a spatial average energy spectrum to be developed by a process which is in lieu of, or possibly in addition to, the usual averaging of reflections by the human neurophysiological system.

This effect is particularly apparent in headphone (binaural) reproduction. Presumably this is because in normal (non-headphone) listening to either live or reproduced sound, there are small head motions of the listener constantly occurring. And with loudspeakers, even though the listener's head may be able to make small movements, the source of the sound is fixed. This may enable the listener to develop the aforementioned spatial average estimate of the energy spectrum. In headphone listening, however, this mechanism is not available because there is no relative motion possible between the listener's ears and the sound source. There also are several other problems associated with binaural presentation, chief among which is the sensation that the sound image is in the middle of one's head. Also, there are questions concerning the basic frequency response as it relates to diffuse-field versus direct-field equalization.

Accordingly, what is needed is a method to process audio signals to restore or simulate this perceptual mechanism with the use of headphones or loudspeakers.

SUMMARY OF THE INVENTION

In various embodiments, the present invention introduces temporal variation in the effective path from the musician to the listener to aid in perception of timbre. Modification of the electrical or acoustical phase of a signal is the same as a time variation (i.e., phase is time). In addition, a wave propagating in a medium requires a particular amount of time to travel a particular distance; hence, time also is distance. It follows that phase is (or can be correlated to) distance.

In one exemplary embodiment, the present invention introduces a random, time-varying time delay into the individual reproduction channels, two in the case of binaural presentation. This emulates the temporal aspect of microphone and/or listener motion. The present invention may be applied as a unidirectional process. No preparation of the source material is required. It can be applied to any multichannel audio signal set. It can process analog or digital signals.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a fixed phase shifter circuit in accordance with an exemplary embodiment of the present invention.

FIG. 2 is a diagram of a variable phase shifter circuit in accordance with an exemplary embodiment of the present invention.

FIG. 3 is a diagram of an analog audio processing system in accordance with an exemplary embodiment of the present invention.

FIG. 4 is a diagram of an analog and digital audio processing system in accordance with an exemplary embodiment of the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

In one exemplary embodiment, the present invention enhances the perception of timbre, or tonal identity, by temporal processing of a recording. The recording may be a fixed-microphone recording. The recording can be analog or digital. While the enhancement of the perception of timbre may be accomplished by introducing a time-varying time delay, it may also be accomplished by suitable phase shifting.

A sound traveling in a medium (e.g., air) has a wavelength which is inversely proportional to its frequency. The velocity of propagation (e.g., distance/unit time) in the medium is constant, therefore a given number of degrees (e.g., phase angle) of wave movement requires an amount of time which is also inversely proportional to frequency. Thus phase and time and distance are related.
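The relationship described above can be written out as a short worked equation. This is a sketch of the standard wave relations, not language from the claims; the symbols used here (wavelength λ, frequency f, propagation speed c, distance r, time T, and phase angle φ in degrees) are introduced for illustration.

```latex
\lambda = \frac{c}{f}, \qquad
T = \frac{r}{c}, \qquad
\varphi = 360^{\circ} \cdot f \cdot T = 360^{\circ} \cdot \frac{r}{\lambda}
```

For a fixed phase angle φ, the required time T = φ / (360° · f) falls as frequency rises, which is the inverse proportionality stated above.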

Whether the time delays are implemented as pure delay or as phase shifting, it is necessary to make a quantitative estimate of the amount of delay which is required. A motion of the microphones of, say, 0.2 m would be represented by a time shift of about 600 microseconds, using the formula T = r/c, where c = speed of sound = 354 m/s, and r = distance in m.
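As a minimal sketch of the estimate above, the formula T = r/c can be evaluated directly. The function and constant names below are hypothetical and chosen only for illustration; the speed of sound is the 354 m/s figure given in the text.

```python
# Speed of sound as stated in the text; other conditions would give other values.
SPEED_OF_SOUND_M_PER_S = 354.0


def delay_for_displacement(distance_m, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Return the time shift T = r / c, in seconds, for a path change of r metres."""
    if speed_m_per_s <= 0:
        raise ValueError("speed of sound must be positive")
    return distance_m / speed_m_per_s


if __name__ == "__main__":
    t = delay_for_displacement(0.2)
    print(f"{t * 1e6:.0f} microseconds")
```

For r = 0.2 m this evaluates to roughly 565 microseconds, which the text rounds to about 600 microseconds.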

In one embodiment, the method of the present invention introduces a random time-varying phase shift, which is free of discontinuities, independently into the channels of a stereophonic electrical signal path. For example, a time-varying phase shift is introduced independently and randomly into the two channels of a stereophonic signal path. The method is not necessarily limited to two channels. The result emulates at least one aspect of the continuous movement of the recording microphones mentioned above.

At middle frequencies, 1 kHz, 600 usec corresponds to 216 degrees of phase delay. An example of a fixed phase shifting circuit is illustrated in FIG. 1, where R1 through R3 are resistors and C is a capacitor 32. The circuit further comprises an operational amplifier 30. The resistance values may vary. In one exemplary embodiment, the values of R1 and R2 are equal or approximately equal. Such a circuit will produce phase shift of 0-180 degrees or 180-360 degrees depending on how it is configured. A relatively uniform delay of 600 usec requires 2160 degrees at 10 kHz, so a cascade of such phase shifters is required. Experimentally, it is not necessary to preserve a constant delay time at all frequencies. This can lead to a reduction in the number of stages required.
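The phase figures above follow from phase = 360 degrees x frequency x delay. The sketch below reproduces that arithmetic and gives a rough lower bound on the number of cascaded stages, assuming each stage contributes at most 180 degrees of shift as described for the FIG. 1 circuit. The function names are hypothetical, and a practical design may need more stages than this bound suggests, or fewer where constant delay is not preserved at all frequencies.

```python
import math


def phase_delay_degrees(frequency_hz, delay_us):
    """Phase lag, in degrees, equivalent to a pure delay at one frequency."""
    return 360.0 * frequency_hz * delay_us / 1e6


def min_stages(frequency_hz, delay_us, max_shift_per_stage_deg=180.0):
    """Lower bound on cascaded stages needed to reach that phase lag."""
    phase = phase_delay_degrees(frequency_hz, delay_us)
    # The small tolerance keeps floating-point round-off from adding a stage.
    return math.ceil(phase / max_shift_per_stage_deg - 1e-9)


if __name__ == "__main__":
    for f in (1_000, 10_000):
        print(f, phase_delay_degrees(f, 600), min_stages(f, 600))
```

With a 600 usec delay this gives 216 degrees at 1 kHz and 2160 degrees at 10 kHz, matching the figures in the text, and a bound of 2 and 12 stages respectively.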

In one embodiment, the phase shifter circuit should be variable according to some external control parameter. In FIG. 2, an embodiment of a variable phase shifter circuit is shown in which an external current 40 controls the phase shift by means of a light-emitting diode 42 which impacts a light dependent resistor 44 so that the resistance varies with varying light emission from the LED. A common voltage controlling several such circuit elements in a cascade produces the required controllable phase-shifter.
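To illustrate how a varying resistance changes the phase shift, the sketch below assumes the stage behaves as a textbook first-order all-pass network with transfer function H(s) = (1 - sRC) / (1 + sRC). That transfer function, the function name, and the component values are assumptions made for illustration; they are not taken from FIG. 2.

```python
import math


def allpass_phase_degrees(frequency_hz, resistance_ohm, capacitance_f):
    """Phase of an assumed first-order all-pass, H(s) = (1 - sRC) / (1 + sRC)."""
    omega = 2.0 * math.pi * frequency_hz
    return -2.0 * math.degrees(math.atan(omega * resistance_ohm * capacitance_f))


if __name__ == "__main__":
    capacitance = 10e-9  # hypothetical capacitor value, in farads
    # Hypothetical light-dependent-resistor values, in ohms.
    for resistance in (5e3, 16e3, 50e3):
        phase = allpass_phase_degrees(1000.0, resistance, capacitance)
        print(f"R = {resistance:.0f} ohm -> {phase:.1f} degrees at 1 kHz")
```

Under this assumption the lag grows from near 0 toward 180 degrees as the resistance rises, passing through 90 degrees where R = 1 / (2πfC).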

Other higher-order (i.e., quadratic) phase shifters could be used. Even analog charge-coupled delay lines could be used with a time-varying clock.

In yet another embodiment, the invention comprises a goniometer, a circuit or device that changes phase continuously, i.e., not in steps. Effectively, the circuit is a phase modulator with two inputs: a modulation input and a signal input. There may be one such goniometer in each signal channel. The modulation input to each goniometer is an independent source of random noise in a control bandwidth chosen to simulate a physically possible movement of the microphones on the order of 0.1 Hz to 1 Hz.

FIG. 3 shows one embodiment of an analog audio processor in accordance with the present invention for a two-channel system. The analog audio signals 2, 12 are applied to two corresponding phase modulator/goniometer circuits or devices 4, 14. These goniometers may be voltage-controlled phase shifters as described above. Two random noise or number generators 6, 16 with suitable low-pass filters 8, 18 provide the random control function.

In a digital embodiment, the audio signal is first digitized and then passed in each channel through a delay which is phase-continuously varied according to a random law at an appropriate rate. This technique is similar to that used in direct-digital-synthesis oscillators. The signal is then reconverted to analog for presentation via headphones or loudspeakers. It should be understood that variation in the phase or time delays, the rate or law controlling such delays and the exact circuit embodiments may vary.
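One simple way to sketch a smoothly varying digital delay is a fractional delay read with linear interpolation between neighboring samples. This is a swapped-in illustration, not the direct-digital-synthesis style technique mentioned above, and the function name and argument layout are hypothetical. Samples outside the input are treated as silence.

```python
import math


def variable_delay(samples, delay_samples):
    """Delay `samples` by a per-sample, possibly fractional, delay.

    delay_samples[n] is the delay, in samples, applied at output index n.
    Linear interpolation between neighbouring input samples keeps the
    output continuous when the delay itself changes smoothly.
    """
    if len(samples) != len(delay_samples):
        raise ValueError("samples and delay_samples must have equal length")
    out = []
    for n, delay in enumerate(delay_samples):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        position = n - delay          # fractional read position in the input
        index = math.floor(position)  # sample at or just before that position
        frac = position - index       # distance past that sample, 0 <= frac < 1
        a = samples[index] if 0 <= index < len(samples) else 0.0
        b = samples[index + 1] if 0 <= index + 1 < len(samples) else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out
```

For example, a constant delay of half a sample returns the average of each pair of adjacent input samples. In a two-channel system, each channel would be given its own independent delay sequence.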

FIG. 4 shows an embodiment for a system that can process digital sound recordings and analog sound recordings. The input can be two analog signals 2, 12 which are converted to digital by analog/digital converters, or a digital input 22 which may be multiplexed. The application of pure delay is straightforward, using goniometer circuits as described above with digital random number generators 26. The delay may be smoothly varied. For example, a DDS clock with continuous phase interpolation can be used to operate a delay memory with the process at a sufficiently high rate that discontinuities will be absorbed in the output reconstruction filters. Output may be digital 29, or may be converted to analog by digital/analog converters 27 in each channel.

The control function is a random or pseudo-random time-varying quantity which controls the phase shifters or delay lines. The rate of variation in this embodiment should be in the range of probable motions of the listener or the microphones. Also, the rate of variation should be low enough that any phase-modulation sidebands will lie below the audio range so as to avoid the intrusion of low-frequency noise. In one exemplary embodiment, a control bandwidth of about 10 Hz is chosen. Because the bandwidth is so low, the random control function could be equally well generated by a true random noise source 6, 16, or by a random-number generator, with a suitable low-pass filter 8, 18.
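A minimal sketch of such a control function is a pseudo-random number generator followed by a low-pass filter. The one-pole filter, the uniform noise distribution, and the function and parameter names below are assumptions made for illustration; the text does not specify a filter type.

```python
import math
import random


def random_control(num_samples, sample_rate_hz, bandwidth_hz, seed=None):
    """Low-pass filtered uniform noise, one control value per sample.

    Values stay within [-1, 1]; scale them to the desired delay or phase range.
    """
    rng = random.Random(seed)
    # One-pole low-pass coefficient for the requested control bandwidth.
    a = math.exp(-2.0 * math.pi * bandwidth_hz / sample_rate_hz)
    y = 0.0
    out = []
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)     # raw pseudo-random input
        y = a * y + (1.0 - a) * x      # smoothing removes step discontinuities
        out.append(y)
    return out


if __name__ == "__main__":
    # Independent generators per channel, here distinguished by seed.
    left = random_control(48_000, 48_000.0, 10.0, seed=1)
    right = random_control(48_000, 48_000.0, 10.0, seed=2)
    print(max(abs(v) for v in left), max(abs(v) for v in right))
```

Because the filter averages heavily at a 10 Hz bandwidth, the output amplitude is much smaller than the input noise, so the result would normally be scaled before it drives a delay or phase shifter.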

In another embodiment, the phase/time variation should be smooth. Step discontinuities may produce audible artifacts. The range of the phase variation is adjustable. The variation should be free of patterns; that is, truly random and not cyclic.

Accordingly, the present invention restores the lost perceptual mechanism derived from relative motions between the source and the listener. The quickness of timbre recognition also may lead to an improvement in intelligibility of all signal types. This comports with the principles of quantitative intelligibility measures such as the Speech Transmission Index which deal with preservation of the infrasonic amplitude modulation transfer function.

Another area of binaural reproduction is the perception of the location of sounds in both azimuth and elevation. This is important in virtual-reality presentations and in information delivery systems, such as fighter plane cockpits. These systems usually concern themselves with stereotactic detection of head position, eye-motion tracking or other measures of directional attention in order to process audio messages in amplitude and phase to force the auditory image to be congruent with head position or visual attention.

The methods and processes of the present invention can be combined with these processes. For example, one way the “in the head” problem in binaural listening can be addressed is by filtering and cross-feeding the left and right signals according to generalized head-related transfer functions (HRTF). The HRTF models the propagation of sound around the head from ear-to-ear for external sound sources. This is another example of a process which is applied to replace a naturally-occurring aspect of hearing when binaural presentation is involved. The HRTF may be dynamically modified with a variable delay as described above.

The method and processes of the present invention also may be combined with assistive hearing devices, such as hearing aids, to improve intelligibility of what is heard through improved recognition of timbre.

Thus, it should be understood that the embodiments and examples described herein have been chosen and described in order to best illustrate the principles of the invention and its practical applications to thereby enable one of ordinary skill in the art to best utilize the invention in various embodiments and with various modifications as are suited for particular uses contemplated. Even though specific embodiments of this invention have been described, they are not to be taken as exhaustive. There are several variations that will be apparent to those skilled in the art.

We claim:
 1. A method for modifying an audio signal, comprising the steps of: introducing a time delay into an audio signal input to produce a modified audio signal, wherein said modified audio signal emulates the temporal aspect of relative motion by a source.
 2. The method of claim 1, wherein the audio signal input is analog or digital.
 3. The method of claim 1, wherein there are multiple audio signals, and a separate time delay is introduced into each signal.
 4. The method of claim 1, further comprising the step of outputting the modified audio signal through a sound reproduction device.
 5. The method of claim 4, wherein the sound reproduction device comprises one or more of headphones, an in-ear receiver, earbud, or a hearing aid, or combinations thereof.
 6. The method of claim 1, wherein the modified audio signals are output to at least one loudspeaker.